A New Optimization Technique for the Inspector-Executor Method
Abstract
This paper presents our HPF compiler, which uses a modified inspector-executor method to implement accesses to distributed arrays. In the modified method, the compiler runs the inspector at compile time to obtain data-dependence information among node processors, and uses that information to optimize the communication code in the executor. The paper describes the idea, the performance improvement shown by our prototype compiler, and the limitations of the method.
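The split between inspector and executor can be illustrated with a minimal sketch. It is not the paper's compiler: it simulates node processors inside one Python process, assumes a block distribution, and uses hypothetical names (`owner`, `inspector`, `executor`) for an indirect access `y[i] = x[idx[i]]`. The inspector scans the index array and builds a communication schedule; the executor then fetches each remote group once and runs the loop. In the modified method described in the abstract, the analogous schedule would be obtained at compile time; the sketch does not model that step.

```python
from collections import defaultdict


def owner(global_index, block):
    """Assumed block distribution: element g lives on processor g // block."""
    return global_index // block


def inspector(idx, n_procs, block):
    """Scan the index array once and record, for each processor, which
    remote elements it needs and which processor owns them."""
    schedule = [defaultdict(set) for _ in range(n_procs)]
    for i, g in enumerate(idx):
        me, src = owner(i, block), owner(g, block)
        if src != me:
            schedule[me][src].add(g)
    return schedule


def executor(x, idx, schedule, n_procs, block):
    """Compute y[i] = x[idx[i]], fetching remote data once per
    (receiver, sender) pair as dictated by the schedule."""
    n = len(idx)
    y = [None] * n
    messages = 0
    for me in range(n_procs):
        lo, hi = me * block, min((me + 1) * block, n)
        ghost = {}
        for src, needed in schedule[me].items():
            # Stands in for one batched message from processor src.
            ghost.update({g: x[g] for g in needed})
            messages += 1
        for i in range(lo, hi):
            g = idx[i]
            y[i] = x[g] if lo <= g < hi else ghost[g]
    return y, messages


x = [10, 11, 12, 13, 14, 15, 16, 17]
idx = [0, 5, 5, 3, 7, 7, 1, 6]
schedule = inspector(idx, n_procs=2, block=4)
y, messages = executor(x, idx, schedule, n_procs=2, block=4)
```

Here processor 0 needs only element 5 from processor 1, even though two of its iterations read it, so the schedule lets the executor send that value once instead of once per access.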
Similar Resources
A Fast Parallel Graph Partitioner for Shared-Memory Inspector/Executor Strategies
Graph partitioners play an important role in many parallel work distribution and locality optimization approaches. Surprisingly, however, to our knowledge there is no freely available parallel graph partitioner designed for execution on a shared memory multicore system. This paper presents a shared memory parallel graph partitioner, ParCubed, for use in the context of sparse tiling run-time dat...
Run-Time Parallelization of Irregular DOACROSS Loops
Dependencies between iterations of loop structures cannot always be determined at compile-time because they may depend on input data which is known only at run-time. A prime example is a loop accessing an array where the array indices are themselves functions of another array determined only at run-time. To parallelize such loops, it is necessary to perform a run-time analysis. We describe a ne...
An Approach for Proving the Correctness of Inspector/Executor Transformations
To take advantage of multicore parallelism, programmers and compilers rewrite, or transform, programs to expose loop-level parallelism. Showing the correctness, or legality, of such program transformations enables their incorporation into compilers. However, the correctness of inspector/executor strategies, which develop parallel schedules at runtime for computations with nonaffine array access...
Run-Time Parallelization for Loops
Current parallelizing compilers cannot extract a significant fraction of the available parallelism in a loop if it has a complex and/or statically insufficiently defined access pattern. In this paper, a run-time technique based on the inspector/executor scheme (an inspector phase followed by an executor phase) is proposed for finding parallelism in loops. Our inspector can determine the wavefronts of a loop with any compl...
New OpenMP directives for irregular data access loops
Many scientific applications involve array operations that are sparse in nature, i.e., array elements depend on the values of relatively few elements of the same or another array. When parallelised in the shared-memory model, there are often inter-thread dependencies which require that the individual array updates are protected in some way. Possible strategies include protecting all the updates, o...
Publication date: 2002